Estimating correlation and covariance matrices by weighting of
market similarity

Michael C. Münnix,1 Rudi Schäfer,1 and Oliver Grothe2

1 Department of Physics, University of Duisburg-Essen, Duisburg, Germany
2 Department of Economic and Social Statistics, University of Cologne, Germany

We discuss a weighted estimation of correlation and covariance matrices from historical financial data. To 
this end, we introduce a weighting scheme that accounts for similarity of previous market conditions to the 
present one. The resulting estimators are less biased and show lower variance than either unweighted or 
exponentially weighted estimators. 

The weighting scheme is based on a similarity measure which compares the current correlation structure of 
the market to the structures at past times. Similarity is then measured by the matrix 2-norm of the difference 
of probe correlation matrices estimated for two different times. The method is validated in a simulation study 
and tested empirically in the context of mean-variance portfolio optimization. In the latter case we find an 
enhanced realized portfolio return as well as a reduced portfolio volatility compared to alternative approaches 
based on different strategies and estimators. 

Keywords: Weighted Correlation Estimation; Covariance Estimation; Time-dynamic Dependence; Mean- 
Variance Portfolio Optimization 



I. INTRODUCTION 

Good estimates of the correlation and covariance ma- 
trices of financial returns are central for a wide range 
of applications such as risk management, option pricing, 
hedging and capital allocation. For example in risk man- 
agement applications, they directly affect the calculation 
of the value at risk or the expected shortfall. In the con- 
text of capital allocation, the correlation structure is key 
in the classical portfolio optimization problem, as shown 
in the seminal work of Markowitz (1952).

Generally, the quality of the estimated matrices in- 
creases with the length of the time series, i.e., the amount 
of data used. For small datasets the matrices have a large 
variance and may even be singular or indefinite. In a
financial context, however, using long time series results
in biased estimates of the correlation structure, since the 
dependence of asset returns is not constant in time (see, 
e.g., King and Wadhwani (1990) for an early review). 

The problem is that standard estimators equally weight all
parts of the dataset. As a consequence, out-of-date and
improper information strongly affects the estimates. This paper
tackles this problem by introducing a new weighted estimator
of the correlation or covariance matrix. This estimator makes
use of enough data to adequately limit its variance but, in
order to minimize its bias, focuses only on parts of the data
where the market is in similar market conditions, i.e., where
it exhibits the same correlation structure.

To reduce the effects of time changing structures, common
approaches in the literature choose time intervals where the
structures are approximately constant. Examples of such
approaches are exponentially weighted estimators like the
RiskMetrics estimators (see, e.g., Longerstaey and Spencer
(1996)) or the estimators discussed in Lee and Stevenson
(2003). Since these estimators only use a small part of the
data, they show a large variance. Moreover, whenever the
number of effectively used observations is not large compared
to the dimensions of the time series, estimated correlation
and covariance matrices may be regarded as completely random.
Laloux et al. (1999) showed in an empirical example that in
such cases 94% of the spectrum of estimated correlation
matrices equals the spectrum of random matrices and only their
largest eigenvalues may be estimated adequately.

Solutions to this problem involve reducing the dimensionality
of the problems by imposing some structure on the
correlations, e.g., by using factor models or shrinkage
estimators as in Ledoit and Wolf (2004), or by noise reduction
techniques, e.g., Random Matrix Filtering (see Plerou et al.
(2002)) or Power Mapping (Schäfer et al. (2009)). Other
approaches reduce the dimensionality by using conditional
models of the correlation matrices going back to the work of
Bollerslev (1986). A short overview of these practices may be
found in Andersen et al. (2007).

With the availability of intraday high frequency financial
data, it was expected that finer sampled data would
effectively enlarge the datasets and improve estimates of
parameters. However, when return data is observed on
shorter time intervals, it is contaminated by market
microstructure effects. These effects influence estimators
and induce bias and noise (see, e.g., Bandi et al. (2008) for
a recent discussion). Possible reasons include asynchrony and
decimalization effects (see, e.g., Münnix et al. (2010a,b)).

Since the amount of data for the estimation may only 
be increased by either considering a longer time period or 
by sampling on higher frequencies, the mentioned prop- 
erties of financial time series limit the amount of usable 
data. Longer time intervals bias the estimators due to the
time changing nature of the matrices. Higher frequencies
intensify the effects of the market microstructure on the 
estimators. 

In this paper, we circumvent these limits. We pro- 
pose to enlarge the amount of usable data by adaptively 
including different parts of the time series with similar 
correlation structures into the estimator. We therefore 
introduce a similarity measure which measures the de- 
gree of similarity between days of the time series based on 
probe correlation estimations. We demonstrate the measure
by assessing similarities in stock returns from the
S&P 500 index. The measure reliably
detects regime changes in the data as well as the special 
market situation during the financial crisis in 2008. 

The similarity measure enables us to construct a 
weighting scheme for correlation or covariance estimators
that attaches high weights to similar parts of the
data and suppresses distortions. In a simulation study, 
we demonstrate that these similarity weighted estimators 
show smaller bias and variance than unweighted or expo- 
nentially weighted estimators. The results hold for con- 
stant as well as for dynamic correlation structures in the 
data. In a real data application we apply our estimator 
to covariance estimation in the context of mean-variance
portfolio optimization. We use time-series of stocks from 
the S&P 500 index and randomly choose stocks to build 
up portfolios. We show that optimal portfolios which 
are based on the similarity weighted covariance estimator 
outperform alternative approaches with respect to real- 
ized volatility and realized return. 

The paper is organized as follows. Section II introduces
the measure of similarity. In section III a similarity
based weighting scheme for estimators of correlation or
covariance is constructed. Section IV contains a simulation
study analyzing variance and bias of the resulting
estimators. In section V we empirically apply the estimators
in the context of mean-variance portfolio optimization.
Section VI concludes.






II. MEASURING MARKET SIMILARITY

We measure the degree of similarity $\zeta$ in the market's
correlation structure by the norm of the difference of the
correlation matrices $C(t_1)$ and $C(t_2)$ of the times $t_1$
and $t_2$, i.e.,

$$ \zeta(t_1, t_2) = \| C(t_1) - C(t_2) \|_2 , \tag{1} $$

where $\|C\|_2$ represents the induced matrix 2-norm of the
real valued matrix $C$, which is the square root of the largest
eigenvalue of the matrix $C'C$.

FIG. 1. Illustration of the $\zeta^{50}$ similarity measure for the
correlation structure of the S&P 500 index from 2005 to the
beginning of 2010. Each point of the graphic reflects the
degree of similarity between the days at its coordinates. The
dark shaded areas indicate a correlation structure that is not
similar to any other period before or after, while the white
areas indicate high values of similarity. The region past Oct
2008 can clearly be identified as the beginning of the
financial crisis in 2008. Furthermore, in Feb 2007 the correlation
structure of the assets changes.

The correlation matrices $C(t_1)$ and $C(t_2)$ are estimated
on a backward-looking rolling window of length $L$. The
window length $L$ will be indicated by a superscript, i.e.,
$\zeta^L$. If outliers are present in the data, the estimates
are based on Spearman's rank correlation instead of Pearson's
product moment correlation as this estimator is more robust to
non-normal distributions. Since the estimates should be
unbiased for time varying correlations, the use of small
window lengths is recommended. As discussed in Laloux et al.
(1999), this results in noisy estimates of the matrices and
only the largest eigenvalues of the matrices are adequately
estimated. However, the similarity measure is based on the
2-norm and thus depends only on this largest eigenvalue, which
can be estimated even for small values of $L$.
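The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `probe_correlation` and `similarity` are hypothetical, the returns are synthetic, and Pearson correlation is used for brevity where the text would switch to Spearman's rank correlation in the presence of outliers.

```python
import numpy as np

def probe_correlation(returns, t, L):
    """Pearson correlation matrix from the L rows of `returns` ending at row t."""
    window = returns[t - L + 1 : t + 1]        # backward-looking window of length L
    return np.corrcoef(window, rowvar=False)   # columns are treated as assets

def similarity(returns, t1, t2, L):
    """Induced 2-norm of the difference of the probe matrices at t1 and t2."""
    diff = probe_correlation(returns, t1, L) - probe_correlation(returns, t2, L)
    return np.linalg.norm(diff, 2)             # largest singular value of diff

rng = np.random.default_rng(0)
returns = rng.standard_normal((300, 5))        # synthetic data: 300 days, 5 assets
zeta_same = similarity(returns, 200, 200, 50)  # identical windows
zeta_far = similarity(returns, 200, 100, 50)   # non-overlapping windows
```

Small values indicate similar correlation structures; comparing a day with itself gives zero.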

Figure 1 illustrates the evolution of the similarity measure
$\zeta^{50}$ for the example of the 471 assets that were
continuously in the S&P 500 index between Aug 2005 and
Jan 2010. The similarity measure is evaluated for every
day between Aug 2005 and Jan 2010 and depicted
as a matrix. The axes represent time; therefore the evolution
of the market related to a specific point in time
is given by the upright (or vertical) intersection through
this point. Darker regions on this intersection are less
similar and brighter regions more similar to the situation
at the specific point in time. In this illustration,
the financial crisis causes a shaded area from Oct
2008 to Mar 2009. The correlation structure in this period
is completely different from any period before. After
this period we find the market stabilizing: The correlation
structure becomes similar to previous market states
again. Besides the financial crisis, we can find, for
any point in time, regions in which the correlation structure
was similar and regions where it was different.

Furthermore, we are able to identify a regime switch
in the correlation structure at the end of Feb 2007,
indicated by a shift from light to dark shaded areas. This
transition is reflected in a raised average correlation level
which affects the measure of similarity. Figure 2 shows
the average correlation of the 471 assets over time.
Between Feb 2007 and Apr 2007 the overall level of correlation
increases, indicating the new correlation regime.
The sharp transition in Feb 2007 was induced by a large
overall price drop of the stocks in the S&P 500. This
originated in drastic events on the Chinese stock market.¹

FIG. 2. Mean of pairwise Spearman's correlation coefficients
for the dataset from Sep 2008 to Nov 2009 each evaluated
with a moving window of 50 trading days. In Feb 2007 the
overall level of correlation increases.



III. SIMILARITY WEIGHTED ESTIMATORS 

The similarity measure $\zeta^L$ may serve as a weighting
scheme for estimators of correlation or covariance matrices.
With respect to the reference point $t_0$ the scheme
assigns high weights to periods where the market behaved
in a similar manner. Periods in which the market behaved
very differently, on the other hand, are suppressed. To this
end, consider the adapted similarity measure

$$ \tilde\zeta^L(t, t_0) = 1 - \frac{\zeta^L(t, t_0)}{2(K-1)} , \quad t \in [t_0 - T, t_0] , \tag{2} $$

where $T$ is the total number of considered time steps,
i.e., the length of the time series. The factor $K$ refers to
the number of assets to include. It is easily checked that
$2(K-1)$ represents the theoretical maximum possible
value of $\zeta$, i.e., the highest possible dissimilarity.

1 See, e.g., Cover Story of Bloomberg Businessweek, Mar 12 2007:
What The Market Is Telling Us.

We note that the probe matrices $C^L$ underlying the
similarity measure are estimated with window length $L$.
Therefore, within the timespan $[t_0 - L, t_0]$, they share
identical values with the probe matrix at $t = t_0$.
$\zeta^L(t, t_0)$ is then dominated by the amount of identical
values and not by the estimated similarity. Therefore, the
similarity measure is not reliable within this region and is
set to the maximum value of the other timespans, resulting in
a corrected measure

$$ \hat\zeta^L(t, t_0) = \begin{cases} \max_{t' < t_0 - L} \tilde\zeta^L(t', t_0) , & t \in [t_0 - L, t_0] \\ \tilde\zeta^L(t, t_0) , & t \in [t_0 - T, t_0 - L) \end{cases} \tag{3} $$



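A normalized weighting scheme for the estimation of the
correlation or covariance matrix $C(t_0)$ or $\Sigma(t_0)$ at
time $t = t_0$ is then

$$ w(t, t_0, L) = \hat\zeta^L(t, t_0) \Big/ \left( \sum_{t' = t_0 - T}^{t_0} \hat\zeta^L(t', t_0) \right) , \tag{4} $$

resulting in the weighted estimators

$$ \hat C(t_0) = \sum_{t = t_0 - T}^{t_0} w(t, t_0, L)\, C^L(t) \quad \text{and} \quad \hat\Sigma(t_0) = \sum_{t = t_0 - T}^{t_0} w(t, t_0, L)\, \Sigma^L(t) . \tag{5} $$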
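The weighting steps of this section (adapted measure, correction inside the probe window, normalization, weighted sum of probe matrices) can be sketched as follows. The sketch assumes that the similarity values relative to the reference day are already available as an array ordered from the oldest day to the reference day; the function names and the toy numbers are hypothetical and only serve the illustration.

```python
import numpy as np

def similarity_weights(zeta, L, K):
    """Normalized weights from similarity values zeta[i] = zeta^L(t_i, t0).

    zeta is ordered from t0 - T (first entry) to t0 (last entry); T > L is assumed.
    """
    adapted = 1.0 - np.asarray(zeta, dtype=float) / (2.0 * (K - 1))
    corrected = adapted.copy()
    # The L + 1 days in [t0 - L, t0] share data with the probe window at t0;
    # they are set to the maximum of the remaining, reliable entries.
    corrected[-(L + 1):] = adapted[: -(L + 1)].max()
    return corrected / corrected.sum()

def weighted_estimate(probe_matrices, weights):
    """Weighted sum of probe matrices: shape (T + 1, K, K) -> (K, K)."""
    return np.einsum("t,tij->ij", weights, probe_matrices)

zeta = np.array([3.0, 1.0, 2.0, 0.5, 0.1, 0.0])   # toy similarity values, T = 5
w = similarity_weights(zeta, L=1, K=5)
probes = np.stack([np.eye(5)] * len(zeta))        # toy probe matrices
C_hat = weighted_estimate(probes, w)
```

In the toy example the least similar day (the first entry) receives the smallest weight, and the days inside the probe window all share the largest reliable weight.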

The superscript $L$ again denotes the respective window
length of the estimators. For large $T$ and time series
with dynamic correlation structure, the weighting scheme
should be restricted to the $s$ largest values of $w$. This
leads to a complete suppression of dissimilar parts of the
data. Let $w_{(s)}$ denote the $s$-th largest value of $w$. The
restricted scheme $w_s$ is then given by

$$ w_s(t, t_0, L) = |w - w_{(s)}|_+ \Big/ \sum_{t' = t_0 - T}^{t_0} |w - w_{(s)}|_+ , \tag{6} $$

with $|w - w_{(s)}|_+ = \max(w(t, t_0, L) - w_{(s)}, 0)$.
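A minimal sketch of the restricted scheme, assuming the unrestricted weights are given as a one-dimensional array; the function name `restricted_weights` and the example values are hypothetical.

```python
import numpy as np

def restricted_weights(w, s):
    """Keep only the part of each weight exceeding the s-th largest weight."""
    w = np.asarray(w, dtype=float)
    w_s = np.sort(w)[-s]                  # s-th largest value of w
    excess = np.maximum(w - w_s, 0.0)     # positive part of w - w_(s)
    return excess / excess.sum()          # undefined if all weights are equal

w = np.array([0.10, 0.30, 0.05, 0.25, 0.20, 0.10])
w_restricted = restricted_weights(w, s=3)
```

Note that with the positive part taken relative to the s-th largest value, that value itself is mapped to zero, so in this example only the two largest weights remain nonzero.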

The unbiasedness of the estimators (5) in time series
where the underlying correlation matrix is constant
is easily checked. However, due to fluctuations of the
weights $w$, their variances are expected to be slightly
larger than for a constant non-adaptive weighting scheme
$w = 1/T$. These effects are explored in the simulation
study in the next section.



IV. SIMULATION STUDY 

The study presented here aims at the validation of the
estimators introduced in the last section. We estimate
correlations and compare the results to standard estimators
with respect to bias and variance. The study consists of
3 scenarios of normally distributed daily returns of 16
assets. The scenarios are constructed similarly to the
testing environments in Pafka and Kondor (2004).

The first scenario is equicorrelation with equicorrelation
parameter $\rho = 0.7$. This means that all pairwise
correlations of the correlation matrix $(\rho_{ij})$ of the 16
asset returns are equal to $\rho = 0.7$ for $i \neq j$. In the
second and third scenarios, the market consists of two
equicorrelated branches, the first 8 assets with equicorrelation
parameter $\rho_1$ and the second 8 assets with $\rho_3$. Assets of
two different branches are equicorrelated with equicorrelation
parameter $\rho_2 = 0.2$. The equicorrelation parameters
of the branches change over time, i.e., $\rho_1 = \rho_1(t)$
and $\rho_3 = \rho_3(t)$.

FIG. 3. Shown are the theoretical similarity matrices of the
second and third scenario for the first 1000 trading days. In
the left figure, the discrete regimes of the second scenario are
clearly visible. In the right figure the similarity matrix of the
third scenario is shown. It shows no sudden changes.

In the second scenario, the market switches deterministically
in turn between three different regimes. Each
regime lasts 100 trading days. In regime 1, the branches
are equicorrelated with parameters $\rho_1 = 0.7$ and
$\rho_3 = 0.3$. In the second regime, these parameters are both
equal to 0.5, in the third regime, they are 0.3 and 0.7,
respectively.

In the third scenario, the parameters $\rho_1(t)$ and $\rho_3(t)$
change sinusoidally with the trading days $t$ according to:

$$ \rho_1(t) = 0.4 + 0.3 \sin\left( \frac{t}{600}\, 2\pi \right) , $$

$$ \rho_3(t) = 0.4 + 0.3 \sin\left( \frac{t - 300}{600}\, 2\pi \right) . $$

Figure 3 depicts the theoretical similarity matrices of the
second and third scenario for the first 1000 trading days.
The discrete regimes of the second scenario are clearly
visible while the similarity matrix of the third scenario
shows no sudden changes.
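The scenarios can be sketched as follows. Several details are assumptions made for this illustration only: the function names are hypothetical, the regime cycle of scenario 2 is assumed to start with regime 1 on day 0, and the sinusoids of scenario 3 are assumed to have a period of 600 trading days, a value chosen because it reproduces the theoretical values 0.1402 and 0.6598 listed for scenario 3 in Table III.

```python
import math
import numpy as np

def rho_scenario2(t):
    """(rho1, rho3) of the regime active on day t; three regimes of 100 days each."""
    regime = (t // 100) % 3
    return [(0.7, 0.3), (0.5, 0.5), (0.3, 0.7)][regime]

def rho_scenario3(t, period=600):
    """(rho1, rho3) on day t for the sinusoidal scenario (period is an assumption)."""
    rho1 = 0.4 + 0.3 * math.sin(2 * math.pi * t / period)
    rho3 = 0.4 + 0.3 * math.sin(2 * math.pi * (t - 300) / period)
    return rho1, rho3

def branch_correlation(rho1, rho3, rho2=0.2, n=8):
    """Correlation matrix of two equicorrelated branches with n assets each."""
    C = np.full((2 * n, 2 * n), rho2)   # correlation between the branches
    C[:n, :n] = rho1                    # first branch
    C[n:, n:] = rho3                    # second branch
    np.fill_diagonal(C, 1.0)
    return C

C = branch_correlation(*rho_scenario3(1000))
rng = np.random.default_rng(1)
returns = rng.multivariate_normal(np.zeros(16), C, size=50)  # 50 days of returns
```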

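The similarity weighted estimator is compared to two
benchmark estimators. The first benchmark is the standard
Pearson correlation estimator based on the last 300
returns. As the second benchmark, we use the RiskMetrics
exponentially weighted correlation estimator. The
estimator weights the $j$-th most recent return with weight
$w_j$. The weights are chosen according to

$$ w_j = \lambda^{j-1} \Big/ \sum_{j'=1}^{n} \lambda^{j'-1} = \lambda^{j-1}\, \frac{1 - \lambda}{1 - \lambda^{n}} , $$

where $n$ is the number of returns entering the estimate and
the decay parameter $\lambda$ is chosen as suggested by
Longerstaey and Spencer (1996).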
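A normalized exponential weighting of this kind can be sketched as below. The function name is hypothetical, and the default decay parameter 0.94 is an assumption for illustration (a value commonly associated with RiskMetrics for daily data), not a value taken from this text.

```python
def exponential_weights(n, lam=0.94):
    """Normalized weights for the n most recent returns; index 0 is the most recent."""
    raw = [lam ** (j - 1) for j in range(1, n + 1)]
    total = sum(raw)                     # equals (1 - lam**n) / (1 - lam)
    return [r / total for r in raw]

w = exponential_weights(300)
```

The weights decay geometrically with the age of the observation, which is why only a small number of recent returns effectively enter the estimate.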



We estimate the correlation matrix in all three scenarios
for the days $t = 1000$, $t = 2500$ and $t = 5000$. To
estimate mean and variance of the estimators, each
simulation is independently repeated 400 times. The results
are presented in Tables I to III, which show the means and
sample standard deviations of the parameters of interest
over the 400 repetitions for the 3 estimators.

Table I shows the results of scenario 1. The parameter
$\rho_{ij} = 0.7$ is estimated adequately in all cases, which
confirms that all estimators are unbiased in this setup.
As expected, the standard estimator has the lowest
variance. It uses a constant weighting scheme. This is
known to be optimal when the underlying correlation is
constant. In this setup, the adaptive weighting scheme
of the similarity weighted estimator should also equally
weight all observations. However, due to stochastic
fluctuations the weights vary. Therefore, the variance of the
estimator is slightly larger than the variance of the
standard estimator. The exponentially weighted scheme suffers
the highest variance as it heavily weights the most recent
observations. This results in an unbalanced weighting
scheme which is not optimal in this scenario.

The results of scenario 2 are shown in Table II. The
standard estimator is highly biased since its weighting
scheme weights data from all 3 regimes equally. Unlike
the standard estimator, the exponential estimator
weights the most recent observations most and therefore
seems unbiased. Again, its variance is the largest among
the three considered estimators. The similarity weighted
estimator shows variances comparable to the variance of
the unweighted estimator but is nearly unbiased. Table
III shows the results of scenario 3. Since in this scenario
the true parameters change continuously in time, the
scenario tests if the adaptive scheme given by the similarity
measure separates the similar regions from the dissimilar
ones in an adequate way. The results are analogous to
the results of scenario 2, but for the days 1000 and 2500
also the similarity weighted and exponentially weighted
estimators deviate from the theoretical values. However,
they both are much closer to the theoretical value than
the unweighted estimator.

It is worthwhile to note that in all scenarios the bias
of the similarity weighted estimator is similar to the bias
of the exponentially weighted estimator. The standard
deviation of the similarity weighted estimator, however,
is only slightly larger than the standard deviation of the
unweighted estimator and much smaller than the standard
deviation of the exponentially weighted estimator.



               similarity         unweighted         exponential
day    ρ       ρ̂       σ          ρ̂       σ          ρ̂       σ
1000   0.7     0.6974  0.0364     0.6979  0.0296     0.6947  0.0736
2500   0.7     0.6991  0.0403     0.7002  0.0288     0.6977  0.0729
5000   0.7     0.6973  0.0429     0.7004  0.0296     0.7022  0.0718

TABLE I. Simulation results for scenario 1, the scenario of
constant correlation structure. Shown are the results for the
similarity weighted, the unweighted and the exponentially
weighted estimator. All estimators are unbiased, the
exponentially weighted estimator shows the largest standard
deviation.

               similarity         unweighted         exponential
day    ρ       ρ̂       σ          ρ̂       σ          ρ̂       σ
1000   0.7     0.6605  0.0339     0.4992  0.0448     0.6911  0.0759
       0.2     0.2007  0.0528     0.1995  0.0552     0.1925  0.1350
       0.3     0.3368  0.0498     0.5000  0.0442     0.3002  0.1299
2500   0.7     0.6792  0.0341     0.4985  0.0456     0.6883  0.0734
       0.2     0.2026  0.0551     0.1996  0.0553     0.1975  0.1363
       0.3     0.3199  0.0464     0.4989  0.0443     0.3019  0.1288
5000   0.5     0.4992  0.0492     0.4972  0.0448     0.5010  0.1072
       0.2     0.2005  0.0623     0.1987  0.0558     0.1947  0.1384
       0.5     0.4994  0.0502     0.4995  0.0458     0.4946  0.1076

TABLE II. Simulation results for scenario 2, the scenario
of discrete regimes in the correlation structure. Shown are
the results for the similarity weighted, the unweighted and
the exponentially weighted estimator. Clearly, the similarity
weighted and the exponentially weighted estimators are
less biased than the unweighted estimator. The exponentially
weighted estimator shows a much larger standard deviation
than the similarity weighted one. Note that the theoretical
values refer to the values of the regimes of one day before the
mentioned days.

               similarity         unweighted         exponential
day    ρ       ρ̂       σ          ρ̂       σ          ρ̂       σ
1000   0.1402  0.2144  0.0548     0.4941  0.0457     0.1927  0.1391
       0.2     0.1994  0.0524     0.1997  0.0561     0.1951  0.1374
       0.6598  0.5796  0.0429     0.3058  0.0532     0.5994  0.0869
2500   0.6598  0.6012  0.0478     0.3062  0.0540     0.6029  0.0891
       0.2     0.2007  0.0529     0.2007  0.0559     0.1992  0.1363
       0.1402  0.1994  0.0516     0.4935  0.0459     0.1905  0.1372
5000   0.6598  0.6692  0.0361     0.4955  0.0457     0.6767  0.0804
       0.2     0.1993  0.0553     0.1996  0.0552     0.1979  0.1360
       0.1402  0.1302  0.0469     0.3030  0.0539     0.1221  0.1397

TABLE III. Simulation results for scenario 3, the scenario
with sinusoidally changing correlation structure. Shown are
the results for the similarity weighted, the unweighted and
the exponentially weighted estimator. Clearly, the similarity
weighted and the exponentially weighted estimators are
less biased than the unweighted estimator. Due to the fast
changing structures, for the days 1000 and 2500 they deviate
from the theoretical values but are much closer to the
theoretical value than the unweighted estimator. Again, the
exponentially weighted estimator shows a much larger standard
deviation than the similarity weighted one.



V. APPLICATION TO FINANCIAL DATA

In this section, we apply our estimator to financial data
in the context of mean-variance portfolio allocation. The
application is motivated by Engle and Colacito (2006),
who showed that the realized volatility of theoretically
optimal portfolios is lowest if the covariance matrices
for the optimization process are correctly specified. We
therefore compare realized volatility and return of various
portfolios drawn from the S&P 500. The study shows
that portfolios based on the similarity weighted estimator
as discussed in this paper outperform alternative
portfolios. We conclude that these similarity weighted
estimators perform very well in real data applications.

The value $V$ of a portfolio consisting of $K$ assets with
prices $S_i$ and corresponding portfolio weights $w_i$
($i = 1 \ldots K$) is given by

$$ V = \sum_{k=1}^{K} w_k S_k = w'S , \tag{7} $$

where $S$ refers to the ($K \times 1$) vector of asset prices and
$w$ contains the respective weights.

Consider an investment period from day $t = 0$ to day
$t = T$. Let $\Sigma$ and $\mu$ be the covariance matrix and the
expectation of the $K$ asset returns over the period. Then
portfolio variance and expectation at time $t = T$ are
given by

$$ \mathrm{Var}(V_T) = w'\Sigma w , $$

$$ \mathrm{E}[V_T] = V_0 (1 + w'\mu) , $$
where 1 is a vector of ones. Let $\delta V_t$ denote the daily
returns of the portfolio over the investment period. Then

$$ \sum_{t=0}^{T} (\delta V_t)^2 $$

is the realized volatility of the portfolio, which is a
measure of the portfolio's risk over the investment period.
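As a small illustration of these quantities, the sketch below computes portfolio values from prices and a realized volatility. It assumes that realized volatility is taken as the sum of squared daily portfolio returns, which is one common convention and an assumption here; the function names and the price data are hypothetical.

```python
def portfolio_values(weights, prices):
    """Portfolio value per day; prices[t][k] is the price of asset k on day t."""
    return [sum(w * s for w, s in zip(weights, day)) for day in prices]

def realized_volatility(values):
    """Sum of squared daily portfolio returns (assumed definition)."""
    returns = [(b - a) / a for a, b in zip(values, values[1:])]
    return sum(r * r for r in returns)

prices = [[100.0, 50.0], [101.0, 49.0], [99.0, 51.0]]   # 3 days, 2 assets
values = portfolio_values([1.0, 2.0], prices)
rv = realized_volatility(values)
```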
In mean-variance portfolio optimization as introduced
by Markowitz (1952), optimal portfolio weights $w_i$ are
derived by minimization problems of the form

$$ \min_w \left( \tfrac{1}{2}\, w'\Sigma w \right) - \gamma\, w'\mu \tag{8} $$

subject to certain constraints, e.g.,

$$ \sum_{k=1}^{K} w_k = 1 \tag{9} $$

(budget restriction). The parameter $\gamma \geq 0$ is the
investor's risk tolerance parameter. A value $\gamma = 0$ denotes
no risk tolerance. In this case, the investor's only aim
is to minimize the portfolio variance. Large values of $\gamma$
denote risk neutrality, i.e., the investor maximizes the
expected portfolio return only.
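For the case of no risk tolerance with only the budget restriction (no short-sale or other constraints, which is an assumption of this sketch), the minimizing weights have the well-known closed form of the inverse covariance matrix applied to a vector of ones, normalized to sum to one. The function name and the covariance matrix below are hypothetical.

```python
import numpy as np

def minimum_variance_weights(cov):
    """Weights minimizing w' cov w subject to the weights summing to one."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)   # solves cov @ x = ones without forming the inverse
    return x / (ones @ x)            # normalize so the weights sum to one

cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])       # toy covariance matrix of two assets
w = minimum_variance_weights(cov)
```

In the toy example the asset with the smaller variance receives the larger weight, and the resulting portfolio variance lies below that of the equally weighted portfolio.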

                Optimization type
Evaluation      unweighted   weighted    naive
14 day          0.00052      0.00047     0.00124
28 day          0.00102      0.00090     0.00252
56 day          0.00231      0.00211     0.00593

TABLE IV. Average realized risk in mean-variance portfolio
optimization for the minimum variance portfolio and different
evaluation windows. The last column provides a comparison
to the naive portfolio.

Since different investors have different risk tolerance
levels, we focus on two special cases of the minimization
problem. We consider the minimum-variance portfolio
(MVP), i.e., the portfolio of minimal variance without
further constraints, and the portfolio with minimal
variance under the constraint of a fixed target portfolio
return $R$ (TRP). The minimum-variance portfolio is the
solution of equation (8) when $\gamma$ is set to zero, i.e., the
investor is not risk tolerant. To obtain portfolio TRP, $\gamma$
can easily be expressed by the target return $R$



R 







(10) 



where $\alpha = \mathbf{1}'\Sigma^{-1}\mu$ and $\beta = \mathbf{1}'\Sigma^{-1}\mathbf{1}$.
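As a hedged sketch of how such an expression can be obtained, assume the objective $\tfrac{1}{2} w'\Sigma w - \gamma\, w'\mu$ with the budget restriction as the only constraint, and introduce the auxiliary symbol $c = \mu'\Sigma^{-1}\mu$ (a symbol chosen for this illustration, not taken from the text). With a Lagrange multiplier $\lambda$ for the budget restriction, $\alpha = \mathbf{1}'\Sigma^{-1}\mu$ and $\beta = \mathbf{1}'\Sigma^{-1}\mathbf{1}$, the first-order conditions give:

```latex
\begin{align*}
\Sigma w - \gamma \mu - \lambda \mathbf{1} &= 0
  \quad\Rightarrow\quad
  w = \gamma\, \Sigma^{-1}\mu + \lambda\, \Sigma^{-1}\mathbf{1} , \\
\mathbf{1}'w = 1
  &\quad\Rightarrow\quad
  \gamma \alpha + \lambda \beta = 1
  \quad\Rightarrow\quad
  \lambda = \frac{1 - \gamma \alpha}{\beta} , \\
\mu'w = R
  &\quad\Rightarrow\quad
  \gamma c + \frac{\alpha (1 - \gamma \alpha)}{\beta} = R
  \quad\Rightarrow\quad
  \gamma = \frac{\beta R - \alpha}{\beta c - \alpha^{2}} .
\end{align*}
```

Under these assumptions, setting $\gamma = 0$ recovers the target return $R = \alpha / \beta$ of the minimum-variance portfolio.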

In a recent paper, Kritzman et al. (2010) argue that
minimum-variance portfolios outperform various other
strategies of portfolio optimization, even with respect to
their return.

By contrast, DeMiguel et al. (2009) raise the question
whether portfolio optimization pays off at all. In their
results, optimized portfolios do not significantly outperform
naive diversified portfolios, i.e., portfolios where the
same amount $1/n$ is invested in $n$ assets. We therefore
include this naive portfolio in our study, even though the
naive portfolio does not depend on estimators of correlation
or covariance. The portfolio strategies MVP and
TRP allow us to rank the estimators of the covariance
structure according to the portfolio performance, while the
outcomes of the naive portfolio confirm the overall
plausibility of the results.

The basic idea of the study is to calculate optimal port- 
folios for every day of our dataset and to evaluate them 
over some investment horizon T with respect to risk (real- 
ized volatility) and return. We then compare the results 
of the different strategies and estimators. 

We use the same dataset as in section II, i.e., the 471
assets of the S&P 500 index that are included in the index
from 2005 to the beginning of 2010. From this dataset, we
randomly choose 10 portfolio constellations of 100 stocks
each. For every trading day from Aug 2008 to Nov 2009
we compute portfolio weights for the constellations
according to the 3 strategies. The necessary covariance
estimates rely on the similarity weighted estimator and
alternatively on the unweighted estimator. For the first
estimator, we need a similarity measure, which is
determined as discussed in section II. The probe matrices to
calculate the similarity measure rely on moving windows
of $L = 50$ trading days and are based on all 471 assets
of the dataset. Window lengths between 30 and 70 trading
days lead to similar results. However, the results for
window lengths around 50 seem to be quite representative.
The weighting scheme of the estimator includes the
$s = 300$ most similar past days. The unweighted estimator
is based on a moving window of 300 days. The
weights of the target return portfolio rely on an
additionally specified target return $R$ and on estimates of the
vector $\mu$ of expected returns as well. The vector $\mu$ is
estimated by the returns of the portfolio's stocks for every
trading day from a moving window of 14 trading days.

The target return is then adaptively chosen to be 5
percentage points above the average entry of $\mu$.

FIG. 4. Average realized volatility (risk) for a minimum
variance portfolio over 10 portfolio constellations, shown for
(a) 14 day, (b) 28 day and (c) 56 day evaluation. The
unweighted correlation matrix is compared to a weighted
correlation matrix using the similarity measure $\zeta^{50}$.
The results are compared to a naive portfolio as a reference.

The evaluation results of realized volatility and returns
are shown in Figs. 4 and 5 and Tables IV and V. The
evaluation periods are 14, 28 and 56 trading days,
respectively. The results shown are averages of the 10
portfolio constellations. Visual inspection of the figures shows
that the naive portfolio performs worst, especially during
the financial crisis. In that time, the incorporation of
the covariance structure into the portfolio weights pays
off. Realized volatility of the optimized portfolios
consistently lies below the realized volatility of the naive
portfolios, with the similarity weighted scheme obtaining the
best results. The results are robust for the considered
investment horizons, which is shown in the tables in more
detail.

                          Optimization type
Evaluation                unweighted   weighted    naive
14 day      Risk           0.00054      0.00048     0.00124
            Return        -0.00391     -0.00252    -0.00103
28 day      Risk           0.00108      0.00095     0.00252
            Return        -0.00918     -0.00567    -0.00202
56 day      Risk           0.00244      0.00224     0.00593
            Return        -0.02138     -0.01547    -0.00362

TABLE V. Average realized return and realized risk in
mean-variance portfolio optimization for a target return of 5%
above the market drift and different evaluation windows. The
last column provides a comparison to the naive portfolio.

In both cases, in the minimum variance portfolio
(MVP) as well as in the 5% above market drift portfolio
(TRP), the similarity weighting reduces the realized risk
by roughly 8 to 12%, depending on the evaluation window
(e.g., from 0.00052 to 0.00047 for the MVP at 14 days).
Moreover, the TRP case reveals that the
realized return could be improved compared to the
unweighted optimization, although the naive portfolio
features an even higher return.



VI. CONCLUSION 



We introduced a measure that quantifies the similarity
of the correlation structure for two different times. This
measure gives a clear indication of drastic changes in the
market structure, such as those following the beginning of
the financial crisis in 2008.

This measure was adapted to calculate weighted
correlation and covariance matrices in which information that
originated from a similar market state is weighted higher.

We analyzed the resulting similarity weighted estimators
in a simulation study and applied them to a
mean-variance portfolio optimization in a historical study.
The results show that our method reduces the portfolio
volatility and enhances the realized return
compared to the use of unweighted correlations. The
application of similarity weighted estimators is especially
advantageous in periods in which the market structure
changes drastically.

FIG. 5. Average realized return and realized volatility (risk)
for a target return of 5% above portfolio drift over 10 portfolio
constellations, shown for (a) 14 day, (b) 28 day and (c) 56 day
evaluation. The unweighted correlation matrix (300 days
moving window) is compared to a similarity weighted
correlation matrix using the similarity measure $\zeta^{50}$.
The results are compared to a naive portfolio as a reference.